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Abstract 

We wish to formally test for changes in the taxonomic diversity of a community, especially in 
the presence of high latent diversity. Drawing on the meta-analysis literature, we construct 
a model for diversity that accounts for covariate effects as well as sampling variability. This 
permits inference for changes in richness with covariates and also a test for homogeneity. We 
argue that we can use the principles of shrinkage estimation to improve richness estimation 
in this nonstandard context, which is especially important given the high variance of richness 
estimators and the increasing abundance of community composition data. We demonstrate 
the methodology under simulation, in a gut microbiome study (testing for a decrease in rich¬ 
ness with antibiotics), and in a soil microbiome study (testing for homogeneity of replicates). 
We believe that this is the first formal procedure for analyzing changes in species richness. 

Introduction 

The taxonomic diversity of a biological population is commonly used as a marker for ecosys¬ 
tem health. However, taxonomic diversity, or species richness, is sensitive to changes in the 
ecosystem, for example, temperature, time, biogeochemical conditions, and anthropogenic 
factors. Understanding the effects of mechanisms that may incite or accelerate changes 
in diversity is crucial to sustaining ecosystem balance. For this reason, many micro- and 
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macroecologists are interested in testing for statistically significant changes in diversity in 
response to one or more covariates. Such inference informs conclusions about sustaining 
the health of the natural ecosystem, whether terrestrial, aquatic, or even within the human 
body. 

Modelling diversity is more complex than may hrst appear. This is because sampling from 
an environment, for instance, the microbial population in a patch of soil, or butterflies in a 
rainforest, is almost always inexhaustive. As a result, any diversity analysis based solely on 
the observed sample fails to account for the unobserved diversity - the taxa in the population 
that eluded the sample. In ecosystems where the total diversity is low relative to observed 
diversity, unobserved species may not greatly threaten the validity of conclusions drawn 
that only consider observed diversity. However, in high-diversity ecosystems, unobserved 
species may comprise the bulk of the total number of species, and only a small fraction of 
the total number of species may be believed to be observed in the sample. The latter case 
is especially prevalent in skin, water and soil microbiome studies, because these ecosystems 
are characterised by large numbers of very rare species. 

In this paper we propose a method for modelling species richness that considers both the 
observed and unobserved members of the population. This allows us to draw conclusions 
about the population under study, rather than merely conclusions about the samples that 
were observed. The number of unobserved members of the population is unknown and must 
be estimated, and the error in estimation must be accounted for in the diversity model. The 
resulting procedure permits formal statistical inference about which covariates affect total 
diversity. In addition, the procedure admits a natural test for homogeneity of taxonomic 
richness of samples. Homogeneity is of particular importance because the experimentalist is 
often interested in the consistency of results across biological replicates. 

A brief overview of the paper is as follows: we begin by discussing some recent published 
work in the ecological literature which examined diversity as a function of covariates, and 
discuss the techniques used by these authors in making their conclusions. We argue that 
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these methods do not account for unobserved diversity. We then introduce our model for 
species richness and discuss parameter estimation. We show some simulation results justi¬ 
fying model assumptions and conclude with an examination of the dataset of Dethlefsen et. 
al, where we conclude that an antibiotic signihcantly decreases gut microflora diversity; and 
the dataset of Whitman et. al, where we conclude homogeneity of biological replicates in 
the soil microbiome. 


Existing methods for diversity quantification 

Interest in modelling changes in species richness spans a broad array of ecological helds. 
The effect of climate change on species richness has been of particular recent interest 
as has the link between human microbiome diversity and health [^[^. A survey of recent 
papers examining changes in diversity revealed that the most common technique used to 
conclude that diversity changes with a covariate is to £t a linear or nonlinear regression 
model for observed richness as a function of the covariate (^|3]-[^. If the regression coefficient 
of the covariate is statistically signihcant, then a relationship is concluded. Usually such 
conclusions are accompanied with a plot of observed diversity against the covariate and 
comments justifying the choice of model; for example, linear versus nonlinear. 

Another common approach to quantifying diversity is to model a sample diversity index, 
the most common of which are the Simpson index and the Shannon index. While use of 
diversity indices is very popular in the ecological literature, there are numerous problems 
associated with the manner in which they are employed, the most serious of which are 
estimation and confounding. For example, the population Simpson diversity is 

c 

HSimpson ^ ^ Pi 

i=l 

where pi is the true probability of observing the Ah taxa and C is the true species richness. 
However, since usually only a subset (sample) of the population is observed, the probabilities 
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Pi are unknown and must be estimated. While Pi = nijn (where Uj is the number of times 
the ith taxa is observed in a sample of size n) is a statistically optimal estimator of pi under 
the assumption of sampling independence (it has minimum variance out of all unbiased 
estimators), it is not the case that the plug-in estimator of Hsimpson is statistically optimal, 
where the plug-in estimator is dehned to be 

C 

HSimpson,p\ug-in ^ ^ Pi 5 

i=l 

where is c the observed species richness. Indeed, the minimum variance unbiased estimator 
was derived only very recently [^. The other issue with diversity indices is their interpre¬ 
tation. Smaller values of the Simpson index are associated with a larger number of classes 
in the population, and also with more inequality in their proportions. Larger values of the 
Shannon index are associated with a larger number of classes, but with more equality in 
their proportions. For this reason, it is not possible to conclude whether a population is 
very diverse, or alternatively very even, based solely on the information provided by the 
value of a diversity index. In this way, diversity indices confound the effects of evenness and 
richness, resulting in limited interpretation of their values as summaries of the sample. For 
these reasons we discourage the use of diversity indices in the ecological literature and focus 
discussion of this manuscript on modelling total species richness. 

We now propose a technique for modelling species richness in a way that accounts for 
the estimation of unobserved diversity. We note here that if a practitioner was interested in 
modelling diversity indices, they could proceed in the same way - that is, by taking account 
of the error in their estimation using the standard errors derived in and utilizing them as 
in the introduced model for species richness. 
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A model for total species richness 


Suppose we are interested in comparing the total species richness across m populations. 
Denote the total richness in the ith. population, observed and unobserved, by Cj, i = 1,..., m. 
Also associated with each population is a set of p covariates, often called “metadata”. We 
assume that species richness a function of the covariates, but also a function of pure random 
variation, so that 

Ci = /do + /dia^i,i + ... + j^pXi^p + Mj, 

where Xi^ is the value of the jth covariate for the ith population, is its coefficient (j = 
1,... ,p), and Ui is a random variable representing the variation in richness not attributable 
to the covariates. In matrix terms. 


C = X/3 + f/. 


where C = [Ci - ■ ■ Cm\^, (3 = [/3o ■ ■ ■and X = [Ixi-'-Xp] with 

— [^^, 3 ''' ^rn,jY ■ We make the assumption that are independent, identi¬ 

cally distributed normal random variables with common variance cr^: U iV(0,aX). We 
assume that we have complete information about the covariates associated with each pop¬ 
ulation, that is, the “design matrix” X is known. Note that while the model is technically 
linear, nonlinear terms may be incorporated through X as usual in a regression analysis. 

Suppose the goal of the experiment is to investigate which covariates do and do not 
alter total species richness, or equivalently, which elements of (3 are equal to zero. This 
type of analysis is of very broad interest and answering it comprised some component of 
each 001 pr] . In order to answer this question, we take a sample of individuals from each 
of the m populations under study. We do not assume equal sample sizes (sampling depth), 
or that every taxon in each population was observed. Because we do not assume that every 
taxon in each population was observed, we do not know Q exactly for any v. the total species 
richness is unknown for each of the populations under study. Consequently our inference 
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about P requires accounting for error in estimation of C. 

Based on each of our samples, we estimate Ci by Ci with standard error dj. A large 
number of estimators for species richness have been developed since the introduction of the 


problem into the statistical literature in 1943 12 . For a recent review, see 13 , and for a 


diversity estimator developed for the high-diversity microbial setting see 14,15 . For esti¬ 


mators based on maximum likelihood or nonlinear regression, asymptotic theory ensures the 
asymptotic normality of diversity estimates under the assumption of correct model specih- 


cation (see 14 and references therein; we also investigate this with simulations below). We 
therefore assume that, conditional on the value of Cj, the estimate Ci is normally distributed 
around Ci with standard deviation Uj, that is, 

Ci\Ci = Ci €i, 

where e* ~ A/'(0, o'f),i = 1,... ,m. Unconditionally we then have the hnal model 

Ci = Po PiXi^i -f ... -1- l3pXi^p -\- Ui-\- Cj, 


or in matrix terms 


where 


C = X/3 + U -1- F, 


e = 


r 1 


/ 


0 ■■ 

■ 0 

\ 

ei 

~ J\f 


0 

^2 ■ 

■ 0 



0, 





Cm 









V 

0 

0 ■■ 

■ t^m _ 

/ 


( 1 ) 


and U ~ A/'(0, cr^/m). Since the only available information on cTj is the standard error dj we 
substitute the latter for the former and henceforth refer only to Uj. We evaluate the effect 
of this substitution via simulation below. 
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It is important to note that the stochastic nature of the estimated total diversity arises 
both from U and e*, that is, through the inherent random variation of the Ci around X/3 
and through the random variation of Ci around Ci. Procedures that model the observed 
diversity rii as a linear function of the covariates effectively set Ci = rii but treat af = 0, 
thus treating the sample as the population and unobserved diversity as null. 

Given model ([^ there are two main hypotheses of interest. The hrst is Hq ■. a^ = 0; that 
is, the variation in the true species richnesses across the m populations is wholly attributable 
to the covariates xi,..., with no unexplained random variation. This hypothesis is often 
referred to as that of homogeneity. The alternative hypothesis of heterogeneity, Ha : ct^ > 0, 
supposes that there is more variability in the diversity estimates than can be explained by 
sampling-based variation in the estimates alone, and that some other mechanism (which 
we ascribe to the random variables Ui,... ,Um) contributes to the observed behavior of the 
estimated species richnesses. Rejection of Hq does not imply that the diversity model is 
misspecihed, merely that there is some factor which alters the true total richnesses that 
is not accounted for in the model. For instance, a predictor variable that affects species 
richness but is absent from the model may result in rejection of Hq even if the populations 
are homogeneous. However, it is also possible to include all informative predictors of richness 
and still have true heterogeneity: this conclusion implies that the species richness of the 
populations is distinct without further cause. 

The second main hypothesis of interest is iJo : /di = • • ■ = /dp = 0, or alternatively, 
that none of covariates explain the variation in richness across populations. The alternative 
hypothesis Ha is then that at least one of the covariates affects richness. If Hq is rejected 
then interest focuses on the covariates Xj that do influence richness: which Pj are nonzero 
and what are their magnitudes? The relevant null hypothesis for the case of one variable is 
then Hq : jSj = 0. Note that the usual regression interpretation of the coefficients applies 
and that f3j is the expected increase in the true diversity of any of the i populations for a 
one unit increase in Xij. 
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It remains to discuss estimation of the model parameters (3 and and implementation 
of the stated hypothesis tests. The log-likelihood of our model is 


... ,al^) = -7^Y1 


2 = 1 


+ (^i) + 


+ <^1 


Maximum likelihood (ML) is a natural choice of parameter estimation technique due to its 


many asymptotic and hnite sample optimality properties in standard settings 16,17 . How¬ 


ever, in this application the choice to use ML is non-trivial because of the boundary problem-. 


> 0. This problem was studied by Crainiceanu and Ruppert 18 , who demonstrate the 
failure of the usual likelihood ratio test asymptotics when testing = 0 against > 0. 

Fortunately, we can exploit the well-developed literature on meta-analysis to resolve these 
difficulties. Meta-analyses arise in many social and health sciences where a researcher wishes 
to pool a number of different (and often disagreeing) studies to determine the presence of 
an overall effect. Each richness estimate fulhlls the role of a study’s effect estimate, the 
standard error of the richness estimate fulhlls the role of the standard error of the effect 
estimate, and the m populations of interest rehect m different studies to be pooled. A 


comprehensive treatment of meta-analyses is given by Demidenko 19 , who discusses both 


restricted maximum likelihood algorithms and also the best choice of hypothesis test in 
this nonstandard boundary case. We note also that in species richness comparison, as with 
meta-analyses, we only know the standard error in the estimates dj and not the true standard 
deviations cTj. For this reason we base our choice of asymptotics on the results of (T^ rather 


than those of 18 . Following the theory laid out by 19 , our restricted ML procedure 


maximizes 


m 


= “2 1 ^ 
2 = 1 


In {al + af) + 


ia - dffP f 
o-H + 


In 




XiXj^ 


(.1 < + d 


and we denote the maximizing values by fd and d^. Unfortunately there does not exist closed 


















form expressions for (3 and bnt we have the following npdate procednres: 


i^s+l — 


a: 




G{a^) = tr 





where in the above and henceforth we assnme that the vector /3 inclndes the intercept 
coefficient and accordingly that xi^i = 1 for i = 1,... ,m (we retain that p represents the 
nnmber of non-intercept predictors). Robust choices of starting values /5o and (Jq are also 
given in [^. Our investigations lead us to conclude that the least squares estimate of /S 
(obtained by regressing the covariates on the observed diversity) is a reasonable value for (3o, 
and the empirical variance in the estimates Q is a reasonable value for (Jq. 

Without boundary complications, hypothesis testing for (3 falls in the standard Wald-type 
framework. Inverting second derivatives of the restricted log-likelihood gives the variance 
estimate 

Var(/3) = (X^W~^X)-\ 

where W = diag((jf -|- ..., d^ -f d^), which we use to make marginal inference about the 

effect of each predictor on species richness via the test statistic , which is distributed 

V[Var(/3)]« 

approximately A/'(0,1). The global test of Hq : (3i = ... = (3p = 0 has test statistic 




which is distributed asymptotically according to a Xp distribution (the subscript denotes the 
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omission of the intercept term). Finally, we define our Q-statistic as 


<3 = 51 


(C. - xTl3f 


i=l 


(Tt 


Under the null hypothesis of homogeneity, Q follows a distribution with m — p — 1 degrees 
of freedom. 

Improving richness estimation 

While most studies are interested in modelling diversity with covariates, it may be the case 
that the practitioner is interested in species richness of itself. As discussed previously, species 
richness estimation is known to be a difficult problem in statistics: most models that offer 
good hts to frequency count data supply large standard errors, which can be traded off at the 


expense of highly parametrized models 14,20 . Species richness estimation has hitherto been 


based on the outcomes of a single experiment in which one obtains the number of species 
observed once (the singletons), the number of species observed twice (the doubletons), and 
so forth, and uses this information to predict the number of species that were not observed. 
This prediction of unobserved richness is added to the observed richness for an estimate 
of the total richness. In the setting discussed in this paper, information is available about 
multiple experiments of a similar nature: for instance, samples of the microbial composition 
of the gut of a number of study participants. It is very likely that the gut microbiomes of the 
participants possess similar structural qualities. Therefore, given the high-variance nature of 
richness estimation of a single sample, it is desirable to take advantage of common features 
across multiple samples to reduce variability in richness estimation. This concept is often 
referred to as shrinkage estimation in the statistical literature. 

Recall that our model for estimated species richness is 


C = lS.l3 + U + e, 
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where U ~ A/'(0, cr'^Im) is the random effect term that reflects the random variation in trne 
species that is not attribntable to the covariates. Thus, each Ui constitutes a realisation of 
random variable, but because we only have access to Ci and not to Ci, we do not know 
the realisation of the random variable Ui and may only predict it. If we let U* denote our 
prediction of the random vector U, the estimate of the total richnesses given by 

(7* = X/3 + U* 


provides a lower variance estimate than that given by C. We argue that if estimates of total 
species richness are of interest, combining an estimate of /3 with a prediction of U permits 
strength in prediction to be shared across multiple samples, resulting in lower variance 
estimates of richness compared to those based simply on frequency count data. 

The question remains of how to best predict the random variable U. The most common 
approach to prediction of random effects is known as best linear unbiased prediction, or 
BLUP 


21 
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Making the appropriate substitutions to the methodology described in 23 


we have that 


w = 


a: 


o-f + cr\ 


-U-iU)- 


The estimated variance of the predictions is given in 23, p751-752] and is a function of 
the variance of /3, the variance of and the covariance between (3 and d^. Note that this 
estimate of the variance does not account for the estimated nature of d^ and af and so may 
underestimate the true sampling variability when the design is unbalanced and no replicates 
are available. Investigation of the extent of this variance underestimation and appropriate 
corrections is an ongoing topic of investigation by the authors. 


Model selection and diagnostics 

The methodology described above is sensitive to the model design X, and method of esti¬ 
mating C. Perusal of the richness estimates, and more importantly, their standard errors, is 
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essential to ensure that the model is not overht (with respect to predictors) and heterogene¬ 
ity is not falsely concluded. One exploratory approach to diagnosing for possible outliers is 
to plot the estimated richness with error bars at ±2 standard errors. For an example, see 
Figures and This technique derives from, and is limited by, the assumption that the 
estimated richness are normally distributed around the true estimated richnesses with stan¬ 
dard deviation equal to the estimated standard error. The visual diagnostic for an outlier is 
a tight interval (small estimated error), especially one centered far from the overall mean. 
While it is tempting to notice the large intervals in this type of plot, in fact these types of 
points do not exert a large influence on the model because most of the variability is captured 
in the large “local error” [af) rather than affecting the estimate of the “global error” (cx^). 

While visual diagnostics of this nature can assist with model selection and formulating 
appropriate hypotheses, graphical procedures such as this one suffer from the problem of 
simultaneous inference. For this reason, “testing” multiple 95% conhdence intervals for 
overlap has an exaggerated probability of committing a Type 1 error. For this reason we 
advocate the mixed model procedure described above, which does not require multiple testing 
corrections. We proceed to demonstrate the method with some examples. 

Results 

Simulations 

The model described above involves two major modelling assumptions: the estimate Ci is 
approximately normally distributed around Ci, and that the standard error in the estimate 
of Ci provides a good approximation to the true standard deviation of Ci around Ci. Unfor¬ 
tunately, it is rare to know the true species richness for a high diversity situation, and so we 
proceed to investigate these assumptions via simulation. The negative binomial distribution 
has a long history in modelling frequency data, and while it rarely captures the large number 
of rare species that are present in many studies of practical interest, it provides a tractable 
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option for model testing. In order to best reflect the large-unobserved diversity case, we draw 
5,000 samples from a negative binomial distribution with probability parameter 0.99 and size 
parameter 500. We then omit the zeros from our samples (these represent the unobserved 
species) and attempt to estimate the number of species that we removed based only on the 
observed species. Repeating this procedure gives us a sequence (7i,..., Cm and di,..., dm- 
We may test the assumption of normality of the estimates around C with approximately 
correct standard errors by observing the distribution of for C = 5000 and comparing 
it with a 7V(0,1) distribution. We compare the results of this procedure under two methods 
of estimating C\ breakaway, a ratio-based method developed for the high-diversity case 
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and CatchAll, a computationally intensive method that hts a large number of mixed-Poisson 
models and selects the model of best £t 20 . Figure compares the empirical distribution 
with a standard normal distribution. The sample mean of the breakaway rescaled estimates 
was —5.11 X 10“^ and the standard deviation was 1.0229. The equivalent values for CatchAll 
were —0.1459 and 1.0202. We note only slight deviation from the A/'(0,1) curve in each case 
and conclude that the assumption of normality is plausible, at least for frequency counts 
that follow an approximate negative binomial distribution. 


While the assumption of normality appears to hold, the effect of the substitution of dj for 
(Ji may affect the test of homogeneity, and so we investigate the Type 1 error rate for this 
test. Similar to the above, we mimic the data-generating process of homogeneous negative 
binomial draws, but now observe the distribution of the homogeneity test statistics, Q, and 
compare their empirical distribution to a xh-i distribution. We draw 10,000 samples and 
partition them into 500 samples of size 20 (20 “replicates”), compute the richness estimates 
and standard errors for each sample, fit an intercept-only model for each sample, and observe 
the distribution of the Q statistics. Our theory suggests that the Q statistics should be 
approximately distributed according to a Xi 9 distribution. We observe from Figure that 
the effect of the substitution dj leads to heavier distribution tails than would be expected 
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Distribution of breakaway estimates of species richness 


Distribution of CatchAII estimates of species richness 




Figure 1: Richness estimates derived from software breakaway (left) and CatchAII (right) 
appear to be approximately normally distributed with correct standard error in the negative 
binomial counts case. 

under a X 19 distribution. The empirical Type 1 error rate for breakaway estimates is 7.2%, 
and for CatchAII the error rate is 7.1%. We conclude that substituting dj for Ui leads to a 
slightly exaggerated probability of rejecting a correct hypothesis of homogeneity and suggest 
caution when a computed test statistic is only marginally signihcant. 


Case study: Gut microbiota richness is affected by antibiotics 

We now demonstrate the applicability of the above methodology in determining the effect of 
antibiotics on the gut microbiome. Dethlefsen et. al employed pyrosequencing technology 
to obtain more than 7,000 rRNA sequences for three human subjects before, during, and 
after a course of ciprofloxacin. They observed that the treatment lead to an overall decrease 
in the observed richness of the microbiota communities but were unable to test this formally. 
We formally investigate this using their dataset. 

A brief perusal of the frequency count tables suggests that we are in the medium diver¬ 


sity setting, and for this reason we use CatchAII 20 to estimate species richness for each 
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Distribution of Q statistics from breakaway 
estimates of species richness 


Distribution of Q statistics from CatchAII 
estimates of species richness 




Figure 2: Richness estimates derived from software breakaway (left) and CatchAII (right) 
appear to be approximately normally distributed with correct standard error in the negative 
binomial counts case. 

individual and each sample. We £t a model with treatment (pre-treatment, during treat¬ 
ment and post-treatment) and patient (Patient A, Patient B and Patient C) as predictors, 
referring to Figure to justify our choice of an additive model. We conclude that the model 
is highly signihcant in explaining richness {Q = 686.84,p < 0.001). In particular, we hnd 
that treatment is highly signihcant in decreasing richness {p < 0.001), reducing diversity by 
456 species on average. However, we hnd that there is no signihcant post-treatment ahect 
{p = 0.638) and that diversity recovers to pre-treatment levels after 4 weeks. These results 
concur with the general conclusions of Dethlefsen et. al j^, but we emphasize that this 
methodology provides a formal approach to testing their hypothesis. 


Case study: Biological replicates of soil samples are homogeneous 

Soil microbial communities are perhaps the most species-rich of all studied environments on 


Earth 24 . Housing complex interfaces between the hydrosphere, atmosphere, lithosphere. 


and biosphere, soils exhibit extreme microscale heterogeneity in potential microbial habitats 


15 





Patient A 


Patient B 


Patient C 



I 



4 




* 4 * 


♦ 




I 




I_I 

PRE 


♦ ♦ 

I _I I_I I_I l_J I_I I_I l_J I_I 

TR POST PRE TRPOST PRE TRPOST 


Figure 3: CatchAll richness estimates for the Dethlefsen et. al dataset and 95% Wald-type 
conhdence intervals. Estimates were computed for each of three patients and pre-treatment 
(PRE), during treatment (TR) and post-treatment (POST). 


[25[|26| , which may support the persistence of microbial species diversity 27 . The complexity 
of these communities pose considerable challenges for diversity analysis and thus provide an 
extremely interesting test case. 

Whitman et al. extracted, amplihed, and sequenced bacterial 16S DNA with soils 
from a held trial with no amendments, with pyrogenic organic matter additions, and with 
fresh biomass additions. Treatments were laid out using a spatially balanced complete block 


design as described in 29 . Sampling took place less than 24 hours after additions, after 12 
days, and after 82 days, with two samples taken less than 15 cm apart and combined for 
each plot. 

We focus on homogeneity of the biological replicates. All Day 1 samples may be con¬ 
sidered to be replicates regardless of amendment, because the amendments were not yet 
incorporated. Symmetric conhdence intervals for breakaway estimates of species richnesses 
for Day 1 samples may be seen in Figure An intercept-only model for the species rich¬ 
nesses initially rejects the null hypothesis for homogeneity {p < 0.0001). However, noting 
that the conhdence interval for richness in Sample 2 appears to be unrealistically tight and 
potentially exerting high inhuence on the model, we re-test for homogeneity with this sample 
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Figure 4: CatchAll richness estimates for the Dethlefsen et. al dataset and 95% Wald-type 
conhdence intervals. Estimates were computed for each of three patients and pre-treatment 
(PRE), during treatment (TR) and post-treatment (POST). 


excluded to hnd that the test for heterogeneity is only marginally signihcant {p = 0.0296). 
Given the conservative effect of the substitution dj discussed above, we do not reject the nnll 
hypothesis and conclude that the Day 1 replicates are in fact homogenous with respect to 
richness. 


Discussion 

The model for species richness presented above has many advantages compared to the popu¬ 
lar approach of htting a linear model for observed species richness with covariates. The most 
important development presented here is a model that acconnts for unobserved diversity, 
which is usnally ignored in formal diversity comparisons. 

While one of the advantages of the approach is its simplicity as a natural extension of 
the linear model, it shares a number of the limitations of the linear model. As a supervised 
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procedure, the practitioner must make sensible choices of predictor variables that balance 
explanatory power with parsimony. Outliers (richness estimates with small standard errors) 
may exert influence on the model and visual diagnostics should be employed to determine 
the appropriateness of the model. Failing to conhrm model assumptions and to check for 
outliers may lead to unreliable tests for signihcance, or false conclusions of heterogeneity. 
Furthermore, richness estimation is a prediction problem, and so it is not possible to im¬ 


plement informative nonparametric approaches 30 . For this reason, estimation of species 
richness requires modelling assumptions. All known models suffer from estimator instability. 


or implausibility of model £t 13,14, which must be traded off when choosing an estimator 
of species richness. There is no guarantee that conclusions based on the mixed effects model 
described above are robust to choices of richness estimator, and we advise caution in this 
regard. 

It has previously been noted in the literature that observed diversity is highly sensitive 
to sampling depth (referred to as sequencing depth in the microbial case). This has lead to 
the common practice of rarefying the data to achieve equal sample sizes for each sample. For 


an excellent discussion of why rarefying should be outlawed, see McMurdie and Holmes 31 


Note that our method proposes modelling total diversity, not observed diversity. Total diver¬ 
sity is a hxed but unknown parameter, and is hence invariant to sampling depth. Estimated 
total diversity is a function of the sample, but it is more robust to sampling depth than ob¬ 
served diversity. While we advocate additional sampling and sequencing whenever possible, 
we note that our method accounts for inexhaustive sampling and emphasize that abundances 
should not be rarehed prior to implementing the methodology. 

We believe that the greatest advantage of this method is the ability to share strength 
across samples to improve inference via the BLUP framework of the linear mixed model. 
Large numbers of rare taxa may prohibit accurate and precise estimation of total diversity 
from single samples, but pooling samples believed to have similar structure can vastly reduce 
standard errors in estimation of total diversity for each of the observed samples, even if their 
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covariate information differs. We believe that this is the hrst proposal to use community 
composition information (“beta diversity”) to improve inference about individual samples 
(“alpha diversity”). 


Conclusions 


We present the hrst approach to modelling total species richness as a function of covariates. 
Existing methods fail to account for unobserved diversity by only noting changes in observed 
diversity. We believe the strength of the approach is most pronounced in high-diversity 
environments, where unobserved diversity may dominate. We demonstrate the approach on 
two microbial datasets: a soil dataset where we conclude homogeneity of biological replicates, 
and a gut microbiome dataset where we conclude that an antibiotic signihcantly decreases 
taxonomic richness. We believe that the method is highly general and applicable to any 
situation where community abundance data is available, including microbial and animal 
abundances. The relevant software for diversity estimation and prediction is given in the R 


package “breakaway” 15 , with the methodology accessible via the function “betta”. Given 


the importance of maintaining stable ecological diversity and ongoing investigation of data 


obtained by the Human Microbiome Project 32,33 , we hope that the inferential procedure 


outlined here will inform local and global policy, especially with respect to human health 
(for example, via gut microflora stabilization) and climate change (for example, via carbon 
hxing soil treatments). 
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